ABSTRACT



A system for adaptively transporting video over networks 
wherein the available bandwidth varies with time. The 
system comprises a video/audio codec that functions to 
compress, code, decode and decompress video streams that 
are transmitted over networks having available bandwidths 
that vary with time and location. Depending on the channel 
bandwidth, the system adjusts the compression ratio to 
accommodate a plurality of bandwidths ranging from 20 
Kbps for POTS to several Mbps for switched LAN and ATM 
environments. Bandwidth adjustability is provided by offer- 
ing a trade off between video resolution, frame rate and 
individual frame quality. The system generates a video data 
stream comprised of Key, P and B frames from a raw source 
of video. Each frame type is further comprised of multiple 
levels of data representing varying degrees of quality. In 
addition, several video server platforms can be utilized in 
tandem to transmit video/audio information with each video 
server platform transmitting information for a single 
compression/resolution level. 

11 Claims, 13 Drawing Sheets 
SYSTEM FOR ADAPTIVE VIDEO/AUDIO TRANSPORT OVER A NETWORK

FIELD OF THE INVENTION

The present invention relates generally to transport of
video and audio information over networks and more
particularly relates to adapting the transport of video and
audio information over IP networks having varying bandwidth
capacities.

BACKGROUND OF THE INVENTION

Traditionally, most, if not all, of the content found on the
Internet today is text and image based. While video content
can add tremendous new excitement and value to the Internet
in the form of advertising, online training, video conferencing
and many other functions, these types of applications are
rare today. Even when they do exist, the quality of the
overall experience is poor. In addition, most often, the cost
is prohibitively high for wide scale deployment.

The Internet and other TCP/IP networks are challenging
environments in which to deliver streaming real time
audio/video. The bandwidth available over a connection at any
particular instant varies with both time and location. This
variation in bandwidth causes entire packets containing
substantial audio/video content to be lost. In addition, the
latency through the network varies, causing the video that is
ultimately displayed to 'jitter' or lose clarity at the client.
These factors may be tolerable for file transfer traffic where
jitter does not matter since high level protocols correct for
errors and losses. They do, however, make data delivery
difficult for real time audio/video streaming applications.

A major challenge in transporting video over TCP/IP
networks is that video requires much higher bandwidth than
most other types of data objects. To illustrate, consider that
the raw data required for a one hour movie shown at a
resolution of 640x480 at 30 fps is approximately 100 GB. To
transmit this uncompressed raw video over a 10 Mbps
Ethernet link would take approximately 22 hours. To
transmit the same video over a 28.8 Kbps modem would
take approximately 320 days. Thus, it is clear that, for
practical purposes, video must be heavily compressed for
real time video transmission over a network having finite
speed.

Another major challenge to transporting video over TCP/IP
networks, or any network generally, is coping with variable
bandwidth. Two aspects of bandwidth variation include
time dependent bandwidth variation and site dependent
bandwidth variation. Time dependent bandwidth variation is
due to changes in network traffic because the network is a
shared resource. Site dependent bandwidth variation arises
from the fact that the video data stream is, in many video
related applications, sent to multiple sites. The connections
from the server to each site typically have varying available
bandwidths. For example, even within the same building,
one recipient may be on a local area network (LAN) while
another recipient may be connected via an integrated
services digital network (ISDN) line. Thus, it would be useful
if available bandwidth was dynamically measured and this
measurement used to provide optimum quality video to each
site. This would minimize any waste of network resources
and reduce CPU resource usage.

Current video transport or delivery systems essentially
ignore the problems of transporting video over TCP/IP
networks as discussed above. These systems provide a
simple control to the sender or creator of the video stream
that functions to select a particular video transmission
bandwidth. A common solution is to select a target
transmission bandwidth as the lowest common bandwidth for all
recipients. This solution results in poorer quality for users
with access to higher bandwidth. Another common solution
is to pump in video data based on the capabilities of the
source, thus allowing the downstream network routers to
drop the packets as needed. This solution results in wasted
network resources.

SUMMARY OF THE INVENTION

The present invention is a system for adaptively
transporting video over networks wherein the available
bandwidth varies with time. The present invention has
application to any type of network including those that utilize the
Internet Protocol (IP) such as the Internet or other TCP/IP
networks. The system comprises a video/audio codec or
coder/decoder that functions to compress, code, decode and
decompress video streams that are transmitted over
networks having available bandwidths that vary with time and
location. Depending on the channel bandwidth, the system
adjusts the compression ratio to accommodate a plurality of
bandwidths ranging from 20 Kbps for plain old telephone
service (POTS) to several Mbps for switched LAN and ATM
environments. Bandwidth adjustability is provided by
offering a trade off between video resolution (e.g., 160x120,
320x240, 640x480), frame rate (e.g., 30 fps, 15 fps, 7.5
fps) and individual frame quality. This flexibility is useful
for different applications that stress different requirements.

The system functions to generate a prioritized video data
stream comprising multiple levels from a raw source of
video. This video stream is stored in a file and accessed by
the video server when servicing clients. In operation, the
video client only receives a subset of the levels. The levels
are chosen to have a suitable data content to match that of
the network connection. This permits a better fit between
network bandwidth consumed and video image quality.
Each of the levels is built on top of the previous levels, with
the higher levels providing incremental information not
present in the lower levels. This ensures that bandwidth is
not wasted on the client end or on the encoder/server side.
The system generates the video stream that is sent to the
client such that a loss of any individual packet on the
network will not cause sustained degraded quality at the client.

The scaleable compression performed by the system is
suitable for transparent video within an Internet environment
characterized by large diversity and heterogeneity. The
system functions to match the image quality of the video
data being transported with the wide variations in available
network bandwidth. In addition, the system can adjust the
video data to match the differences in available computing
power on the client computer system. The system, utilizing
'best effort' protocols such as those found on the Internet,
adapts to the time varying nature of the available bandwidth.

There is therefore provided in accordance with the present
invention a method of transporting video over a network
channel, comprising the steps of compressing a raw video
source into a plurality of frames, each frame comprising a
plurality of levels, each level corresponding to a particular
degree of compression, estimating the bandwidth of the
network channel, selecting one of the plurality of levels of
each frame to transmit over the network channel in
accordance with the bandwidth estimate whereby the level
selected optimizes the use of the bandwidth of the network
channel, and sending the selected level of each frame over
the network channel.

The step of compressing comprises the step of compressing
the raw video source into a plurality of different types of
frames, each frame type containing a different amount of
video content information, the plurality of different types of
frames grouped so as to form a video stream consisting of a
plurality of group of pictures (GOP) sequences. The step of
compressing comprises the step of compressing the raw
video source into Key, P and B type frames, the Key, P and
B frames generated so as to form a video stream consisting
of a plurality of group of pictures (GOP) sequences.

There is also provided in accordance with the present
invention a method of transporting video from a video server
to a video client over a network channel, comprising the
steps of compressing data from a raw video source so as to
generate a plurality of frames, each frame being of a
particular frame type, each frame type containing a particular
amount of video content information, each frame comprising
a plurality of levels, each level corresponding to a
particular degree of compression, estimating the bandwidth
of the network channel, determining the amount of video
information waiting to be displayed at the video client,
selecting one of the plurality of levels of each frame to send
over the network channel in accordance with the bandwidth
estimate whereby the level selected optimizes the use of the
bandwidth of the network channel, choosing which frames
having a particular frame type to send over the network
channel in accordance with the amount of video information
waiting to be displayed at the video client, and sending the
chosen frames having a particular frame type and of the
selected level over the network channel.

Further, there is provided in accordance with the present 
invention a video server for transporting video from a video
source over a network channel to a video client, the video 
source consisting of a plurality of frames of video data, each 
frame of video data consisting of multiple compression 
levels and being of a particular type, the video server 
comprising receiver means for inputting frames of video
data from the video source, sending means coupled to the 
receiver means, the sending means for determining which 
compression level within the frame and which frames hav- 
ing a particular type to transmit in accordance with the 
estimated available bandwidth of the network channel, the
sending means for encapsulating the frames of video data 
into a plurality of packets for transmission over the network 
channel, and a controller for managing the operation of the 
receiver means and the sending means whereby the rate of 
transmission of the sending means is maintained so as to
match the available bandwidth of the network channel. 

In addition, the sending means comprises a rate control 
unit for measuring the available bandwidth of the network 
channel, a frame selector for inputting video frame data 
output by the receiver means, the frame selector outputting
frames of a particular compression level in accordance with 
the bandwidth measured by the rate control unit, a packet 
generator for inputting video frame data output by the frame 
selector, the packet generator for encapsulating the video 
frame data into a plurality of packets for transmission, the
packet generator determining which frames having a par- 
ticular type are to be transmitted, a packet transmitter for 
placing onto the network channel the plurality of packets 
output by the packet generator, and a receiver for receiving 
acknowledgments sent by the video client over the network
channel in response to packets received thereby. 

There is further provided in accordance with the present 
invention a method of measuring the bandwidth of a net- 
work channel connecting a sender to a receiver, the method 
comprising the steps of the sender transmitting a plurality of
packets to the receiver over the network channel to yield a
particular number of bytes online, the receiver transmitting
to the sender acknowledgments in response to the receipt of
the packets by the receiver, measuring the reception band- 
width of the packets by the receiver, increasing the number 
of bytes online until the rate of increase of the reception 
bandwidth decreases to within a predetermined threshold, 
and estimating the bandwidth of the network channel to be 
the reception bandwidth at the receiver. 
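
The measurement steps above can be sketched as follows. This is a minimal illustration under stated assumptions and not the claimed implementation: `measure_reception_bandwidth` is a hypothetical callback standing in for sending packets and reading the acknowledgments returned by the receiver, and the starting size, step size and threshold are arbitrary example values.

```python
from typing import Callable


def estimate_bandwidth(
    measure_reception_bandwidth: Callable[[int], float],
    start_bytes_online: int = 1_000,      # assumed starting point
    step_bytes: int = 1_000,              # assumed increment
    threshold: float = 0.05,              # assumed relative-gain threshold
    max_bytes_online: int = 1_000_000,    # safety cap, not from the text
) -> float:
    """Increase the bytes online until the reception bandwidth flattens.

    measure_reception_bandwidth(bytes_online) is a hypothetical probe that
    returns the bandwidth (bytes/s) observed at the receiver.
    """
    bytes_online = start_bytes_online
    previous = measure_reception_bandwidth(bytes_online)
    while bytes_online + step_bytes <= max_bytes_online:
        bytes_online += step_bytes
        current = measure_reception_bandwidth(bytes_online)
        # Relative growth of the reception bandwidth for this step.
        gain = (current - previous) / previous if previous > 0 else float("inf")
        previous = current
        if gain <= threshold:
            break  # growth is within the threshold: treat the channel as saturated
    return previous


def simulated_channel(bytes_online: int) -> float:
    """Toy channel for demonstration: linear growth, saturating at 50,000 bytes/s."""
    return min(10.0 * bytes_online, 50_000.0)
```

With the toy channel, the scan stops at the first step that yields no further gain and reports the saturated reception bandwidth as the estimate.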

In addition, there is provided in accordance with the 
present invention a method of maintaining a maximum 
number of bytes online in a network channel connecting a 
sender to a receiver, the network channel having a particular 
available bandwidth, the method comprising the steps of 
determining the number of bytes sent (BytesSent) to the 
receiver utilizing sender related data concerning the previ- 
ous packet sent and the last packet sent, determining the 
number of bytes received (BytesRec) by the receiver utiliz- 
ing receiver related data concerning the previous packet 
received and the last packet received, calculating the sending 
rate (SendRate) in accordance with the following equation 



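$$\mathrm{SendRate} = \frac{\mathrm{BytesSent}}{\mathrm{TimeToSend}(\mathrm{PreviousResp}) - \mathrm{TimeToSend}(\mathrm{LastResp})},$$

calculating the receiving rate (RecRate) in accordance with
the following equation

$$\mathrm{RecRate} = \frac{\mathrm{BytesRec}}{\mathrm{TimeToRec}(\mathrm{PreviousResp}) - \mathrm{TimeToRec}(\mathrm{LastResp})},$$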



comparing the sending rate to the receiving rate, increasing 
the sending rate if the sending rate is less than or equal to the 
receiving rate, and decreasing the sending rate if the sending 
rate is greater than the receiving rate. 
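
The comparison of sending and receiving rates can be illustrated with a short sketch. The `Response` record, its cumulative byte counters and the 10% adjustment step are assumptions introduced for illustration; the text does not specify them. The time difference is taken as a magnitude so that both rates come out positive regardless of the order of subtraction.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Response:
    """Data carried by one acknowledgment (hypothetical layout)."""
    time_sent: float       # TimeToSend: when the packet left the sender (s)
    time_received: float   # TimeToRec: when the packet reached the receiver (s)
    bytes_sent: int        # cumulative bytes sent up to this packet
    bytes_received: int    # cumulative bytes received up to this packet


def rates(previous: Response, last: Response) -> Tuple[float, float]:
    """Return (SendRate, RecRate) in bytes/s between two responses."""
    bytes_sent = last.bytes_sent - previous.bytes_sent
    bytes_rec = last.bytes_received - previous.bytes_received
    send_rate = bytes_sent / abs(previous.time_sent - last.time_sent)
    rec_rate = bytes_rec / abs(previous.time_received - last.time_received)
    return send_rate, rec_rate


def adjust(target_rate: float, send_rate: float, rec_rate: float,
           step: float = 0.1) -> float:
    """Increase the target if the receiver keeps up, otherwise decrease it."""
    if send_rate <= rec_rate:
        return target_rate * (1.0 + step)
    return target_rate * (1.0 - step)
```

For example, if 10,000 bytes were sent in one second but only 9,000 bytes arrived in the same interval, the sending rate exceeds the receiving rate and `adjust` lowers the target.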

There is also provided in accordance with the present 
invention a method of transporting video from a video server 
to a video client over a network channel, comprising the 
steps of compressing data from a raw video source so as to 
generate a plurality of frames, each frame being of a 
particular frame type, each frame type containing a particu- 
lar amount of video content information, each frame com- 
prising a plurality of levels, each level corresponding to a 
particular degree of compression, estimating the bandwidth 
of the network channel, determining the amount of video 
information waiting to be displayed at the video client, 
selecting one of the plurality of levels of each frame to send 
over the network channel in accordance with the bandwidth 
estimate whereby the level selected optimizes the use of the 
bandwidth of the network channel, choosing which frames 
having a particular frame type to send over the network 
channel in accordance with the amount of video information 
waiting to be displayed at the video client, sending the 
chosen frames of a type containing a higher amount of video 
data content and of a selected level over the network channel 
utilizing a reliable communication protocol, and sending the 
chosen frames of a type containing a lower amount of video
data content and of a selected level over the network channel 
utilizing an unreliable communication protocol. 
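
The split between reliable and unreliable delivery can be sketched as a simple mapping from frame type to transport class. The text does not state at this point which frame types count as having a higher amount of content; the sketch assumes that only Key frames do, and mentions TCP and UDP merely as examples of reliable and unreliable protocols. All names are illustrative.

```python
from enum import Enum
from typing import FrozenSet


class FrameType(Enum):
    KEY = "Key"
    P = "P"
    B = "B"


class Transport(Enum):
    RELIABLE = "reliable"      # for example TCP
    UNRELIABLE = "unreliable"  # for example UDP


# Assumption: only Key frames are treated as "higher content" frames.
RELIABLE_TYPES: FrozenSet[FrameType] = frozenset({FrameType.KEY})


def transport_for(frame_type: FrameType,
                  reliable_types: FrozenSet[FrameType] = RELIABLE_TYPES) -> Transport:
    """Pick the transport class for a frame according to its type."""
    if frame_type in reliable_types:
        return Transport.RELIABLE
    return Transport.UNRELIABLE
```

Passing a different `reliable_types` set, for instance one that also contains `FrameType.P`, changes the policy without touching the selection logic.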

Still further, there is provided in accordance with the
present invention a video server system for transporting 
video from a plurality of video sources over a network 
channel to a video client, each video source consisting of a 
plurality of frames of video data, each frame of video data 
consisting of a single compression level and being of a 
particular type, the video server system comprising a plu- 
rality of video servers, each video server associated with a 
single video source at a particular compression level, each 
video server comprising receiver means for inputting frames
of video data from the video source associated with that 
particular video server, sending means coupled to the 
receiver means, the sending means for determining which 
frames having a particular type to transmit in accordance
with the available bandwidth of the network channel, the 
sending means for encapsulating the frames of video data 
into a plurality of packets for transmission over the network 
channel, a controller for managing the operation of the 
receiver means and the sending means, and a rate controller
for determining which video server to utilize for transmis- 
sion of video data based on the available bandwidth of the 
network channel. 

The sending means comprises means for interfacing the 
video server to the rate controller, a bandwidth measurement 
unit for measuring the available bandwidth of the network
channel, a packet generator for inputting video frame data 
output by the receiver means, the packet generator for 
encapsulating the video frame data into a plurality of packets 
for transmission, the packet generator determining which 
frames having a particular type are to be transmitted, a
packet transmitter for placing onto the network channel the 
plurality of packets output by the packet generator, and a 
receiver for receiving acknowledgments sent by the video 
client over the network channel in response to packets 
received thereby.

BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is herein described, by way of example 
only, with reference to the accompanying drawings, 
wherein:

FIG. 1 is a high level block diagram illustrating the 
adaptive video transport system of the present invention 
including the video compression/file generator, video server 
and video client; 

FIG. 2 is a high level block diagram illustrating the video
server portion of the present invention in more detail;

FIG. 3 is a high level block diagram illustrating the video 
client portion of the present invention in more detail; 

FIG. 4 is a block diagram illustrating an example group 
of pictures (GOP) comprising a key frame and a plurality of
P and B frames; 

FIG. 5 is a diagram illustrating the five levels of video 
data that make up a Key frame as stored in the file format of
the present invention; 

FIG. 6 is a diagram illustrating the five levels of video
data that make up a P frame as stored in the file format of the 
present invention; 

FIG. 7 is a diagram illustrating the five levels of video 
data that make up a B frame as stored in the file format of 
the present invention;

FIG. 8 is a diagram illustrating a sample group of pictures 
sequence composed of Key, P and B frames making up a 
video stream; 

FIG. 9 is a high level diagram illustrating the sender
portion of the video server in more detail; 

FIG. 10 is a graph illustrating the receiver bit rate versus 
the number of bytes online; 

FIG. 11 is a high level flow diagram illustrating the scan 
phase of the bandwidth measurement portion of the present
invention; 

FIG. 12 is a high level flow diagram illustrating the fixed 
phase of the bandwidth measurement portion of the present 
invention; 

FIG. 13 is a high level flow diagram illustrating the
method of selecting frames to be transmitted performed by 
the sender portion of the present invention; 
FIG. 14 is a high level flow diagram illustrating the
method of sending a packet performed by the sender portion 
of the present invention; and 

FIG. 15 is a high level block diagram illustrating an 
alternative embodiment of the adaptive video transport 
system of the present invention including the video 
compression/file generator, multi-platform video server and
video client. 

DETAILED DESCRIPTION OF THE 
INVENTION 

Notation Used Throughout 
The following notation is used throughout this document. 



Term    Definition

AVI     Audio Video Interleaved
CPU     Central Processing Unit
GOP     Group of Pictures
GUI     Graphical User Interface
IP      Internet Protocol
ISDN    Integrated Services Digital Network
LAN     Local Area Network
MPEG    Moving Picture Experts Group
POTS    Plain Old Telephone Service
RSVP    Resource Reservation Protocol
TCP     Transmission Control Protocol
UDP     User Datagram Protocol



Note that throughout this document, the term video is 
meant to encompass both video data and audio data. 

System Overview 

The present invention is a system for adaptively trans- 
porting video and audio over networks wherein the available 
bandwidth varies with time. The invention has application to 
any type of network including those that utilize the Internet 
Protocol (IP) such as the Internet or any other TCP/IP based 
network. A high level block diagram illustrating the adaptive 
video transport system of the present invention is shown in 
FIG. 1. The system, generally referenced 10, comprises a 
video compression/file generator 14, video server 18 and 
one or more video clients 22. Only one video client is shown
for the sake of clarity.

The video compression/file generator 14 in combination 
with the video client 22 comprise a video/audio codec or 
coder/decoder that functions to compress, code, decode and 
decompress video streams that are transmitted over the 
network 20 into a compressed video and audio file. The 
compressed file may be in any suitable format such as Audio 
Video Interleaved (AVI) format. Note that the network may 
comprise any type of network, TCP/IP or otherwise includ- 
ing the Internet. The generation of the compressed video and 
audio file 16 can be performed either online or off-line. 
Typically, the video and audio file is generated off-line. Note
that any suitable method of video compression can be
utilized in the present invention, such as those described in
connection with the Moving Picture Experts Group (MPEG)
MPEG-1, MPEG-2 or MPEG-4 standards.

One important aspect of the invention is that although the 
available bandwidth of the network may vary with time and 
location, the quality of the transmitted video is varied in 
accordance with the available bandwidth. Depending on the 
channel bandwidth, the system adjusts the compression ratio 
to accommodate a plurality of bandwidths ranging from 20 
Kbps for plain old telephone service (POTS) to several 
Mbps for switched LAN environments. Bandwidth adjustability
is provided by offering a trade off between video
resolution (e.g., 160x120, 320x240, 640x480), frame rate
(e.g., 30 fps, 15 fps, 7.5 fps) and individual frame quality. This
flexibility is useful for different applications that stress
different requirements.
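
To give a sense of the scale of this trade off, the uncompressed bit rate of each example resolution and frame rate can be computed, together with the compression ratio needed to fit a given channel. The sketch below assumes 24 bits per pixel and takes 1 Kbps as 1000 bits per second; neither value is stated in the text, and the function names are illustrative only.

```python
def raw_bit_rate(width: int, height: int, fps: float,
                 bits_per_pixel: int = 24) -> float:
    """Uncompressed video bit rate in bits per second (24 bpp is an assumption)."""
    return width * height * bits_per_pixel * fps


def required_compression_ratio(width: int, height: int, fps: float,
                               channel_bps: float,
                               bits_per_pixel: int = 24) -> float:
    """Ratio by which the raw stream must shrink to fit the channel."""
    return raw_bit_rate(width, height, fps, bits_per_pixel) / channel_bps


if __name__ == "__main__":
    # Resolutions and frame rates are the examples given in the text.
    for width, height in [(160, 120), (320, 240), (640, 480)]:
        for fps in (30, 15, 7.5):
            ratio = required_compression_ratio(width, height, fps, 20_000)
            print(f"{width}x{height} @ {fps} fps needs about {ratio:,.0f}:1 for 20 Kbps")
```

Under these assumptions, even the smallest example (160x120 at 7.5 fps) has a raw rate of about 3.5 Mbps, which is why lowering resolution and frame rate alone cannot reach POTS bandwidths without also reducing individual frame quality.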

The system functions to generate a prioritized video data
stream comprising multiple levels from a raw source of
video 12. This video stream is stored in a file (compressed
video and audio file 16 in FIG. 1) and accessed by the video
server 18 when servicing clients 22. In operation, the video
client only receives a subset of the levels that form the video
and audio file 16. The levels are chosen to have a suitable
data content to match that of the network connection
between server and client. This permits a better fit between
network bandwidth consumed and video image quality. Each
of the levels is built on top of the previous levels, with the
higher levels providing incremental information not present
in the lower levels. This ensures that bandwidth is not
wasted on the client end or on the encoder/server side. The
system generates the video stream that is sent to the client
such that a loss of any individual packet on the network will
not cause sustained degraded quality at the client.
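
Because each level adds incremental information to the levels below it, choosing the subset of levels for a client amounts to finding the highest level whose cumulative bit rate still fits the bandwidth estimate. The following sketch illustrates this under the assumption that the incremental bit rate of each level is known; the rates used in the usage note are invented for illustration, and the lowest level is always sent.

```python
from typing import Sequence


def select_level(incremental_rates_bps: Sequence[float],
                 bandwidth_bps: float) -> int:
    """Return the highest 1-based level whose cumulative rate fits the bandwidth.

    incremental_rates_bps[i] is the extra bit rate contributed by level i + 1
    on top of the levels below it (hypothetical input).
    Level 1 is always returned at minimum, since higher levels build on it.
    """
    cumulative = 0.0
    chosen = 1
    for level, rate in enumerate(incremental_rates_bps, start=1):
        cumulative += rate
        if cumulative > bandwidth_bps:
            break
        chosen = level
    return chosen
```

For five levels with incremental rates of 20, 30, 50, 100 and 300 Kbps, an estimate of 120 Kbps selects level 3, because levels 1 to 3 together need 100 Kbps while adding level 4 would need 200 Kbps.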

The scaleable compression performed by the system is 
suitable for transparent video within an Internet environment
characterized by large diversity and heterogeneity. The 
system functions to match the image quality of the video 
data being transported with the wide variations in available 
network bandwidth. In addition, the system can adjust the 
video data to match the differences in available computing
power on the client computer system. The system, utilizing 
'best effort' protocols such as those found on TCP/IP 
networks, adapts to the time varying nature of the available 
bandwidth. 

During the transport of video data, the server process 
functions to employ an adaptive congestion control method. 
The method estimates the network bandwidth or link capac- 
ity and adjusts the amount of video data to be sent over the 
link accordingly. The system of the present invention can be
adapted to exploit the bandwidth reservation (RSVP) pro- 
tocol and quality of service features of TCP/IP networks that 
are currently evolving. 

A high level block diagram illustrating the video server 
portion of the present invention in more detail is shown in
FIG. 2. The video server 18 comprises one or more receivers 
30, one or more senders 32 and a controller 34. During 
operation, a receiver instance is created for each request for 
a different video object. The data input to the receiver may 
be provided from an AVI data file, for example. The
video data file may be located on the same computer as the 
video server or may be located on a remote computer. The 
video data file can be stored on a single computer, e.g., video 
server, or on multiple platforms, e.g., multiple video servers, 
as described in more detail below. In this case, the video data
is transmitted over a network that connects the remote video 
data and the video server. Each instance of the receiver 30 
functions to receive data from the video data file that was 
previously generated by the video compression/file genera- 
tor module 14.

The sender functions to accept video frame data from the 
receiver and encapsulate the video data into packets for 
transmission over the network to the client. Each client that
requests a connection to be established causes an instance of 
the sender to be created. Requests for multiple video sources
from the same client cause additional instances of the sender 
to be created. The sender functions to assemble packets for 
transmission from the video source data input to the receiver.
The packets are formed on the basis of the current choice for 
the level of video transmission quality. Based on bandwidth 
measurements, the sender determines the appropriate level 
of quality to transmit to the client to best match the available 
bandwidth. Assembled packets are sent to the network for 
delivery over the network connection to the video client(s). 

The sender also measures the available bandwidth of the
network connection between the video server and the video
client. As described in more detail below, the sender utilizes
the bandwidth measurements to determine the appropriate video
quality level to send over the connection. If too low a video
quality is chosen then network bandwidth is wasted and a
better picture could be had at the client display. On the other
hand, if too high a video level is chosen then too much data
may become lost or corrupted, which also causes the quality
of the picture on the client display to suffer.

The controller 34 functions to manage the plurality of 
receivers, the plurality of senders, the assembly of packets 
from the video source file, delivery of the packets over the
network connection and measurement of the bandwidth of 
the network connection. The sender is described in more 
detail hereinbelow. 

A high level block diagram illustrating the video client 
portion of the present invention in more detail is shown in 
FIG. 3. The video client 22 comprises a packet receiver 50, 
packet decoder 52, a display generator 54 and a transmitter 
51. The packet receiver functions to receive video packets as 
they come in from the network connection. The video stream 
data is removed and input to the packet decoder 52. The 
packet decoder functions to decode and decompress the 
video data stream and sends the decoded/decompressed 
video stream to the display generator 54. The display 
generator functions to prepare the video data for actual 
transmission to and display on the host computer's display 
subsystem. In addition, the packet decoder functions to
generate acknowledgments in response to the reception of
packets from the video server. The acknowledgments, in
addition to other status information, are sent back to the video
server via the transmitter 51.

Video and Audio File Generation and Format 

The generation of the video source file, e.g., video and 
audio file 16 (FIG. 1), and its internal format will now be 
described in more detail. As previously described, the video 
source file used by the video server to generate the video 
stream that is sent over the network connection to the client 
is created by the video compression/file generator 14 (FIG. 
1). The input to the compression/generator is a raw video 
source 12. The raw video source can be, for example, a non 
compressed AVI file, a non compressed QuickTime file or a 
compressed MPEG-1 audio/video file. 

The function of the video compression/file generator is to 
compress the raw video source into multiple levels of 
varying quality. In particular, the raw video source is com- 
pressed into three types of data objects commonly referred 
to as frames. The three types of frames include Key frames, 
P frames and B frames. These frames are similar to the I 
frames, P frames and B frames, respectively, as described in 
the MPEG-1 specification standard (officially designated as 
ISO/IEC 11172) and the MPEG-2 specification standard 
(officially designated as ISO/IEC 13818). 

The compressed video stream that is sent to the client 
comprises a plurality of data units termed 'groups of pic- 
tures' or GOPs. A block diagram illustrating an example 
group of pictures (GOP) comprising a key frame and a 
plurality of P and B frames is shown in FIG. 4. A group of
pictures or GOP comprises a sequence of frames made up of
a combination of Key, P and B frames. Each GOP has a
single Key frame as the first frame, which is followed by one or
more P and B frames.

P frames are dependent on other frames in that they 
contain incremental changes to video data that was delivered 
previously either in a Key frame or another P frame. B 
frames are also dependent on other frames in that they contain
incremental changes to video data that was delivered pre- 
viously either in a Key frame or a P frame. Note that B 
frames never contain data that modifies a previous B frame. 
Therefore, a B frame may be lost during transmission without
having any effect on the following frames in the GOP 
sequence. 
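
These dependency rules can be made concrete with a small sketch that finds, for each frame in a GOP, the earlier frame it builds on. The sketch assumes that a P or B frame depends on the nearest preceding Key or P frame; the text says only that the reference is a previously delivered Key or P frame, so the "nearest" rule is an assumption, as are the single-letter frame labels.

```python
from typing import List, Optional


def reference_index(gop: List[str], i: int) -> Optional[int]:
    """Index of the frame that frame i builds on, or None for a Key frame.

    gop is a list of "K", "P" and "B" labels (illustrative encoding).
    Assumption: the reference is the nearest earlier Key or P frame.
    """
    if gop[i] == "K":
        return None
    for j in range(i - 1, -1, -1):
        if gop[j] in ("K", "P"):
            return j
    raise ValueError("a GOP must begin with a Key frame")


def is_droppable(gop: List[str], i: int) -> bool:
    """True if no other frame in the GOP builds on frame i."""
    return all(reference_index(gop, k) != i for k in range(len(gop)))
```

For a GOP ordered as Key, B, P, B, P, B, P, every B frame is droppable because no later frame references it, whereas the Key frame and the first two P frames are not.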

With reference to FIG. 4, the example GOP is shown 
comprising a Key frame 60, three B frames 62, 66, 70 and
three P frames 64, 68, 72. Each GOP typically represents a 
particular unit or chunk of video information such as a scene 
in the video. For example, depending on the compression 
technique used, drastic scene changes may trigger the gen- 
eration of a new GOP headed by a new Key frame. The 
video stream, as shown by the arrow, is made up of a 
sequence of GOPs transmitted one after the other. Each of 
the three types of frames will now be described in more 
detail. 

Key frames are constructed so as to incorporate all the 
video information that is essential for the decoding and 
display of P and B frames. Key frames typically are the 
largest in terms of data size of the three frames. It is possible 
that only partial information from the key frames gets 
delivered to the client. If Key frames are lost or arrive 
damaged, the subsequent P and B frames cannot be used as 
they build on the data contained in the Key frame. 

The video data incorporated into P frames includes data 
that is predicted based on a previous Key frame or a previous 
P frame. The information that is included within a P frame 
is mainly the motion estimation information which is essen- 
tial for the decoding and display of the P and B frames. In 
the event that Key frame information is missing, i.e., a Key 
frame was skipped or lost, all the subsequent P frames based 
on that particular Key frame will be ignored in order to prevent
visual artifacts. The video server utilizes the fact that partial 
Key frame information is missing, based on feedback from 
the video client, to skip sending subsequent P frames that are 
based on the corrupted or lost Key frame in order to 
conserve bandwidth. 

The video data incorporated into B frames includes
motion estimation information that is based on the information
that was previously sent either in a Key frame or a P
frame. Note that B frames are never based on a previously
sent B frame. When certain Key frame or P frame data is
missing, i.e., a Key or P frame was skipped or lost, all the
B frame data subsequent to the lost frame is skipped by the
video server in order to conserve bandwidth.
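
The skipping rules described above can be illustrated with a minimal Python sketch. It assumes that a GOP is represented as a list of frame-type strings ("K", "P", "B") and that the position of the lost frame is known from client feedback. The function name `frames_to_skip` and this representation are hypothetical and are not part of the disclosed file format. Only the rules stated in the text are modeled; chained dependencies between successive P frames are not.

```python
def frames_to_skip(gop, lost_index):
    """Return indices of later frames the server would skip after a loss.

    gop: list of frame types, e.g. ["K", "B", "P", ...]
    lost_index: position of the lost or skipped frame within the GOP
    """
    lost_type = gop[lost_index]
    later = range(lost_index + 1, len(gop))
    if lost_type == "K":
        # P and B frames build on the Key frame, so all of them are skipped.
        return [i for i in later if gop[i] in ("P", "B")]
    if lost_type == "P":
        # Per the text, B frame data after a lost P frame is skipped.
        return [i for i in later if gop[i] == "B"]
    # B frames are never referenced by other frames.
    return []


# Example GOP of FIG. 4: a Key frame followed by alternating B and P frames.
gop = ["K", "B", "P", "B", "P", "B", "P"]
print(frames_to_skip(gop, 0))  # lost Key frame
print(frames_to_skip(gop, 2))  # lost P frame
print(frames_to_skip(gop, 1))  # lost B frame
```

In this sketch a lost B frame yields an empty list, which reflects the statement that a B frame may be lost without affecting the following frames.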

The raw video source is compressed into multiple types of 
frames comprised of video data having varying degrees of 
quality since the network cannot guarantee any particular 
bandwidth or an error free network connection. Thus, these 
multiple frame types can be assigned varying degrees of 
importance or priority. The most important of all the frame 
types are the Key frames which are assigned the highest 
priority. Being the most important, key frames are sent using 
a reliable mechanism. Such a reliable mechanism includes 
using a network protocol such as TCP or reliable UDP.
Reliable UDP refers to utilizing UDP, a basically unreliable
protocol, in combination with a reliable mechanism that sits
at a higher layer in the communication stack such as the 
Application Layer. The upper communication levels ensure 
that packets are delivered to the client. 

The second most important frame type is the P frame,
which is transmitted using a semi reliable protocol such as
reliable UDP as described above. If P frames are lost or
corrupted en route to the video client, the video server may
or may not resend them. For example, if too much time has
passed, replacement packets would arrive at the client too
late for display.

The least important frame type is the B frame, which is
sent using an unreliable protocol such as UDP. The B frame
data may or may not reach the video client, depending on the
condition of the network connection between the server and the
client. Upon arrival at the client of B frame data, the client
determines whether it is useful and should be displayed. If
the client determines that the B frame is not usable, an
interpolation mechanism is used to improve the video quality.

As described previously, the video stream stored in the
video and audio source file (compressed video and audio file
16 in FIG. 1) is made up of three types of frames, i.e., Key
frames, P frames and B frames, that are grouped into
sequences of GOPs. In addition, each frame type is further
broken down into multiple levels of detail. In the example
protocol and file format disclosed herein, each frame type is
further broken down into five different video data levels,
numbered 1 through 5. Level 1 contains the least amount of
data which represents the lowest video quality and level 5
contains the most amount of data representing the highest
quality of video.

Every frame (Key, P and B frames) output by the video
compression/file generator is composed of data from all five
levels. Thus, the video source file contains data representing
a broad variation in output video quality. The video
compression/file generator functions to assemble GOPs each
having a particular combination of Key, P and B frames.
Thus, some GOPs may have fewer or more P and B frames.
Each frame, however, contains video data for each of the five
quality resolution levels. However, for each GOP, the video
client only receives data corresponding to a single level. The
video server determines for each GOP the appropriate level
of data to send to the client. Once a video quality level is
chosen by the video server, it is used for the entire GOP.
Adjacent GOPs can be comprised of different level data.
However, data of different levels cannot be sent within a
GOP.

A diagram illustrating the five levels of video data that
make up a Key frame as stored in the file format of the
present invention is shown in FIG. 5. A sample Key frame
and each of its five levels of data of varying resolution and
quality is shown in the Figure. Each level is shown with a
corresponding data size. The data size for the levels is 0.5
KB, 1 KB, 3 KB, 7 KB, 15 KB which correspond to levels
1, 2, 3, 4, 5, respectively. Thus, the total data size of the
sample Key frame for all five levels is 26.5 KB. The data
sizes in FIG. 5 and the subsequent Figures represent an
example file and are for illustration purposes only. However,
the relative sizes of the data for each of the levels does
increase when going from level 1 towards level 5. This is to
be expected since level 5 contains the highest quality video
data.

A diagram illustrating the five levels of video data that
make up a P frame as stored in the file format of the present
invention is shown in FIG. 6. A sample P frame and each of
its five levels of data of varying resolution and quality is
shown in the Figure. Each level is shown with a corresponding
data size. The data size for the levels is 0.1 KB, 0.2 KB,
0.5 KB, 1 KB, 2.2 KB which correspond to levels 1, 2, 3, 4,
5, respectively. Thus, the total data size of the sample P
frame for all five levels is 4 KB.

A diagram illustrating the five levels of video data that
make up a B frame as stored in the file format of the present
invention is shown in FIG. 7. A sample B frame and each of
its five levels of data of varying resolution and quality is
shown in the Figure. Each level is shown with a corresponding
data size. The data size for the levels is 0.15 KB, 0.35
KB, 0.5 KB, 1 KB, 3 KB which correspond to levels 1, 2,
3, 4, 5, respectively. Thus, the total data size of the sample
B frame for all five levels is 5.0 KB.

A diagram illustrating a sample group of pictures (GOP)
sequence composed of Key, P and B frames making up a
video stream is shown in FIG. 8. In this example, the video
server has determined that level 2 data should be sent for this
GOP. Thus, the Key frame 80, B frames 82, 86, 90 and P
frames 84, 88, 92 are shown depicting level 2 data and
associated data size. The total data size of the GOP is 2.485
KB.

Video Server Process

The video server portion of the video transport system of
the present invention will now be described in more detail.
The function of the video server is to accept a remote client
connection request, retrieve a local or remote stored file and
transmit it to the client. Before and during the transmission
of the video information, the server appropriately adjusts the
rate of data flow from the server to the client. The rate is
adjusted beforehand based on initial estimation of the bandwidth
of the data channel. In addition, the data rate is
adjusted during transmission using a bandwidth measurement
method that uses statistical evaluation of the connection
between the server and the client. The dynamic adjustment
of the data rate by the server functions to allow the client to
receive video having a quality that matches the bandwidth
capacity of the connection. Further, during the server/client
connection, the client can control the transmission of the
data by the server, thus performing a video on demand
function.

The acknowledge packet sent by the client comprises an
identification of the last received packet, its arrival time and
a list of any packets missed since the transmission of the
previous acknowledge.

A high level diagram illustrating the sender portion of the
video server in more detail is shown in FIG. 9. The sender
32 comprises a frame selector 100, packet generator 102,
packet transmitter 104, rate control unit 106 and receiver
108. In operation, the frame selector functions to accept the
full frame video data containing all five levels of data from
the receiver and select out of the five levels of data, the level
of data appropriate for the connection with a particular client.
The choice of what compression level to send is made on a
client by client basis. The frame selector uses bandwidth
information provided by the rate control unit 106 to determine
which of the five levels of data to pass to the packet
generator. It is important to note that the raw video source
data may be compressed into more or less than five levels.
A higher number of levels permits a finer tuning of the
available bandwidth to the amount of data sent over the
connection.

In combination with the estimated bandwidth
measurement, the frame selector utilizes a level bandwidth
table in determining which level data to select. A different
level bandwidth table is associated with each video source
file. The level bandwidth table contains an entry for each of
the five possible compression levels. Each entry contains the
average bandwidth necessary to transmit the data at that
level. The frame selector chooses a level having the most
information content that the network connection can support
using the bandwidth measurements performed by the rate
control unit. For example, the level bandwidth table for a
sample video source file may be as follows:

[Table: level bandwidth table for a sample video source file]

If, for example, the rate control unit measures the bandwidth
of the network connection to be 25 Kbps, the frame selector
would pass only level 2 data to the packet generator. Thus,
the output of the frame selector would comprise a sequence
of video frames wherein each video frame contains data
from only one of the video compression levels (level 2 data
in this example).

It is important to note that for the very first frame or
packet that is to be sent to the client, no bandwidth measurement
is available. This is because the bandwidth measurement
method, as described in more detail below, utilizes
transmitted packets to determine the bandwidth of the channel.
Thus, before the first packet is sent, a different mechanism
is used to initially determine the bandwidth of the
channel. In its request to open a video source, the video
client transmits to the server the bandwidth of the connection
the last time the client was connected to a server. This
mechanism is based on the assumption that the previous
connection a client had with a server is similar to the present
connection. In the case where a computer is attached to
the TCP/IP network via two ways, e.g. dial up modem and
high speed LAN, this mechanism does not provide an
accurate initial bandwidth estimate.

The packet generator 102 functions to receive the frames
having video data from a particular compression level and
encapsulate them into packets for transmission over the
network. The assembled packets are output to the packet
transmitter 104 which is responsible for delivery of the
packets over the network. In addition to preparing packets
from the frames received, the packet generator functions to
determine which (if any) frames to skip. Depending on the
measured bandwidth of the channel, the packet generator
may skip frames in order to reduce the transmitted bit rate.
This occurs when the bandwidth of the network connection
cannot support transmission of every Key, P and B frame.
The method of choosing which frames to select is described
in more detail hereinbelow.

The packet generator does not send packets to the packet
transmitter 104 until requested to do so by the packet transmitter.
The delivery of the packets onto the network is controlled by
the rate control unit 106. The rate control unit keeps track of
the amount of video information in terms of time that is
queued for display at the client. In addition, the video frames
from the video source are time stamped for synchronization
purposes. The rate control unit uses acknowledges received
from the client via the acknowledgment receiver 108 to
determine the next packet transmission time. Once the
packet transmitter is notified to send the next packet of data,
it requests a packet from the packet generator.
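
The level selection performed by the frame selector can be sketched in Python as follows. This is only an illustration under stated assumptions: the table values below are hypothetical (the sample table is not reproduced here), and the name `select_level` and the fallback to the lowest level when no entry fits are assumptions rather than disclosed behavior.

```python
# Hypothetical level bandwidth table: average Kbps needed per level.
# These figures are illustrative assumptions, not values from the source.
LEVEL_BANDWIDTH_KBPS = {1: 12, 2: 22, 3: 50, 4: 110, 5: 250}


def select_level(measured_kbps, table=LEVEL_BANDWIDTH_KBPS):
    """Pick the highest level whose average bandwidth fits the channel."""
    fitting = [level for level, kbps in table.items() if kbps <= measured_kbps]
    # Assumed fallback: use the lowest level when even that does not fit.
    return max(fitting) if fitting else min(table)


print(select_level(25))    # 25 Kbps measured by the rate control unit
print(select_level(1000))  # a fast LAN connection
```

With this assumed table, a measured bandwidth of 25 Kbps selects level 2, which mirrors the example given in the text.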






Notification of acknowledges or ACKs received by the 
receiver 108 are also input to the packet transmitter in order 
to assure proper receipt by the client. In addition, the packet 
transmitter maintains a buffer of packets transmitted to the 
client. In the event the video server determines to resend a 
packet, the packet transmitter retrieves the packet from the 
buffer. Once receipt of a packet is acknowledged by the 
client, the packet is deleted from the buffer and the buffer 
space is freed up. 
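
The buffering behavior of the packet transmitter can be sketched as a small Python class. The class name `RetransmitBuffer`, its method names and the use of sequence numbers as keys are hypothetical choices made for illustration.

```python
class RetransmitBuffer:
    """Holds transmitted packets until the client acknowledges them."""

    def __init__(self):
        self._packets = {}  # sequence number -> packet payload

    def on_sent(self, seq, payload):
        # Keep a copy of every packet handed to the network.
        self._packets[seq] = payload

    def on_ack(self, seq):
        # Acknowledged packets are deleted so their space is freed up.
        self._packets.pop(seq, None)

    def get_for_resend(self, seq):
        # Returns None if the packet was already acknowledged.
        return self._packets.get(seq)

    def __len__(self):
        return len(self._packets)


buf = RetransmitBuffer()
buf.on_sent(1, b"key frame data")
buf.on_sent(2, b"p frame data")
buf.on_ack(1)
print(len(buf), buf.get_for_resend(2))
```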

Network Bandwidth Measurement Process 

The bandwidth measurement method as executed by the 
rate control unit 106 in the sender will now be described in 
more detail. The bandwidth measurement method actually 
comprises two separate phases: the first is a
scan phase and the second is a fixed phase. In
general, the bandwidth measurement method operates by
transmitting packets through the network connection and 
measuring the rate of reception of the packets at the client. 
A graph illustrating the receiver bit rate versus the number 
of bytes online is shown in FIG. 10. The number of bytes 
transmitted into the network pipe is increased slowly until a 
point is reached where bytes are not received any quicker at 
the client. The term bytes online means the number of bytes
or packets that have been transmitted by the server or the
sender but not yet received by the client. During this scan
phase portion of the bandwidth measurement method, the
'immediate' flag is set 'on' for each packet sent by the
sender. This causes the client to send an acknowledge packet
for every packet received. Thus, the sender should receive an
acknowledge packet for every packet transmitted to the
client. As shown in FIG. 10, as the number of packets or bytes
online increases, a point is reached where the client does not 
receive packets any faster. The corresponding receive rate at 
this point can be modeled as an estimate of the bandwidth of 
the network channel. 

The scan phase portion of the bandwidth measurement 
method will now be described in more detail. A high level 
flow diagram illustrating the scan phase of the bandwidth 
measurement method of the present invention is shown in 
FIG. 11. As stated previously, the immediate flag is set 'on' 
for all packets transmitted by the sender during the scan 
phase of the bandwidth measurement method. This forces 
the client to immediately send an acknowledge packet for 
every packet received over the channel. In addition, an
acknowledge packet is also sent if the last received packet
has a sequence number greater than the sequence number
expected to follow the previously received packet. In this case,
a packet loss event has occurred. Also, an acknowledge packet
is sent if the previous acknowledge was sent more than a
predefined time out period ago. For example, if the time out
period is 3 seconds, an acknowledge is sent if the previous
acknowledge was sent more than 3 seconds ago.
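
The three conditions under which the client sends an acknowledge packet can be summarized in a short Python sketch. It assumes that sequence numbers increase by one per packet, so that a gap indicates a lost packet, and that times are expressed in seconds. The function name and parameters are hypothetical.

```python
def should_send_ack(immediate_flag, seq, last_seq, now, last_ack_time,
                    timeout_s=3.0):
    """Decide whether the client sends an acknowledge packet.

    immediate_flag: 'immediate' flag carried by the received packet
    seq, last_seq: sequence numbers of this and the previous packet
    now, last_ack_time: current time and time of the previous acknowledge
    """
    if immediate_flag:
        return True  # scan phase: acknowledge every packet
    if seq > last_seq + 1:
        return True  # gap in the sequence numbers: packet loss event
    # Otherwise acknowledge only once the time out period has elapsed.
    return now - last_ack_time > timeout_s


print(should_send_ack(False, 7, 4, now=1.0, last_ack_time=0.0))
print(should_send_ack(False, 5, 4, now=1.0, last_ack_time=0.0))
```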

The acknowledge packet sent by the client contains an 
identification of the last received packet, its arrival time and
a list of any packets missed since the transmission of the 
previous acknowledge. Initially, the recommended bytes 
online (RecommendedBytesOnline) is set equal to the size 
of the packet (PacketSize) (step 110). In the next step, a 
single packet is sent by the sender to the client (step 112). 
The current number of bytes online (BytesOnline) is then 
calculated (step 114). The number of BytesOnline can be 
calculated since the sender has knowledge of each packet 
that is placed into the network pipe in addition to having 
knowledge of each acknowledgment received from the 
client. Thus, at any one time the sender is aware of outstanding
packets still in the network pipe. Next, the number of
bytes online is compared to the recommended bytes online
(step 116). The number of bytes online can be calculated
by subtracting the sequence number of the last packet
acknowledged from the sequence number of the last packet
that was sent. Both these entities
are known by the sender and thus the number of bytes online
can be calculated. If the number of bytes online is less than
the recommended bytes online then control returns to step
112 and an additional packet is placed into the network pipe.
In this manner the number of bytes online is made equal to
the recommended bytes online.
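
The bytes online calculation and the comparison of step 116 can be sketched in Python. The sketch assumes a fixed packet size and sequence numbers that increase by one per packet, neither of which is stated explicitly in the text. The function names are hypothetical.

```python
def bytes_online(last_sent_seq, last_acked_seq, packet_size):
    """Bytes sent by the sender but not yet acknowledged by the client."""
    outstanding_packets = last_sent_seq - last_acked_seq
    return outstanding_packets * packet_size


def may_send_another(last_sent_seq, last_acked_seq, packet_size,
                     recommended_bytes_online):
    # Step 116: keep sending while fewer bytes are online than recommended.
    online = bytes_online(last_sent_seq, last_acked_seq, packet_size)
    return online < recommended_bytes_online


# Packets 8, 9 and 10 are still in the network pipe.
print(bytes_online(10, 7, 512))
print(may_send_another(10, 7, 512, recommended_bytes_online=2048))
```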

After the packet is sent, a time out is then set to a
particular value, for example, 1000 (step 118). The sender
then waits for either an acknowledgment or a time out to
occur (step 120). If a time out occurs then control is returned
to step 112 since the packet is assumed lost and another
packet is then sent (step 122). If an acknowledgment was
received, the number of acknowledged bytes online 
(AckBytesOnline) is then calculated (step 124). The 
acknowledged bytes online is equal to the recommended 
bytes online for the last acknowledged packet. Each packet 
that is sent by the sender has associated with it a number 
indicating the recommended bytes online at the time that 
particular packet was sent. This number is stored in a log at 
the sender and associated with the particular packet that is 
transmitted. When a packet is acknowledged the recommended
bytes online for that particular acknowledged packet
is recalled. If the value of the acknowledged bytes online is
less than the recommended bytes online then control returns
to step 112 and another packet is placed into the network
pipe (step 126). If the number of acknowledged bytes online
is equal to the recommended bytes online then the receiving
bandwidth is calculated (step 128).

The effect of these steps is to keep the number of packets 
or bytes in the network pipe constant and in a steady state. 
The receiving bandwidth is calculated from the sending 
speed since an acknowledge is received for every packet that 
is placed into the pipe. This assumes that an acknowledge 
packet is sent immediately upon the client receiving a packet 
from the sender. If the receiving bandwidth of the pipe has 
not been exceeded then the sending rate at the sender should 
be equal to the receiving rate at the client. Thus, as long as 
the maximum bandwidth of the channel is not exceeded, the
sending rate can be modeled as the receiving rate and 
correspondingly the receiving bandwidth can be computed. 

It is then determined whether the receiving bandwidth has 
leveled off (step 130). With reference to FIG. 10, in this step, 
it is checked whether the number of bytes online has begun 
to level off as shown in the right most portion of the curve 
in the Figure. The leveling off of the receive bandwidth is 
detected by comparing the current receiving bandwidth to 
the average of the last five values of the receiving band- 
width. If the latest value of the receiving bandwidth is within 
5% of the average then the receiving bandwidth is considered
to have leveled off. Consequently, the bandwidth of the
network connection is estimated to be the last value of the
receiving bandwidth. If the receiving bandwidth has not
leveled off, i.e., it is not within 5% of the average of the
previous five measurements, then the recommended bytes online
(RecommendedBytesOnline) is incremented by the packet 
size (step 132). Control then returns to step 112 and an 
additional packet is placed into the network pipe. 
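
The leveling off test of step 130 can be sketched in Python as follows. The sketch compares each new measurement against the average of the five measurements that preceded it and assumes that no decision is made until five earlier values exist. The class name `LevelOffDetector` is hypothetical.

```python
from collections import deque


class LevelOffDetector:
    """Detects when the receiving bandwidth stops increasing."""

    def __init__(self, window=5, tolerance=0.05):
        self.history = deque(maxlen=window)  # previous measurements
        self.tolerance = tolerance           # 5% of the average

    def update(self, receiving_bw):
        """Record a measurement and return True if it has leveled off."""
        leveled = False
        if len(self.history) == self.history.maxlen:
            average = sum(self.history) / len(self.history)
            leveled = abs(receiving_bw - average) <= self.tolerance * average
        self.history.append(receiving_bw)
        return leveled


detector = LevelOffDetector()
for bw in [10, 20, 30, 40, 50, 60, 61, 61, 61, 61]:
    if detector.update(bw):
        print("leveled off at", bw)
```

When `update` returns False the sender would increase RecommendedBytesOnline by one packet size (step 132) and continue scanning.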

If the receiving bandwidth is found not to have leveled off
it means the number of bytes online corresponds to the linear
portion of the curve in FIG. 10. Thus, the maximum bandwidth
of the network pipe has not been reached and additional
packets can be pumped into the network channel. If
the receiving bandwidth has been found to have leveled off
the recommended bandwidth (RecommendedBW) is set
equal to the current value of the receiving bandwidth
(ReceivingBW) (step 134). The recommended bandwidth
value is utilized by the rate control unit as an initial estimate
of the bandwidth of the network connection.

The scan phase portion of the bandwidth measurement
method is used initially as a relatively crude estimate of the
bandwidth of the network channel. During steady state
operation of the sender portion of the video server a fixed
phase bandwidth measurement method is utilized to better
fine tune and track changes in the bandwidth of the network
channel. A high level flow diagram illustrating the fixed
phase method of the bandwidth measurement portion of the
present invention is shown in FIG. 12. During the fixed
phase of the bandwidth measurement method the immediate
flag is set to 'off' in each packet sent by the sender. The first
step is to set a variable representing the time to send
(TimeToSend) equal to the current time, i.e., now (step 140).
Next, it is checked whether the time to send has been
reached, i.e., whether the current time is greater than or
equal to the value of the time to send (step 142). If the
time to send has been reached, a
packet is sent into the network channel (step 150). Information
about the packet is then stored in a database (step
152). The information stored in the database includes the
PacketID, PacketSize and the value of the TimeToSend. A
new value for the TimeToSend, which represents the time
for transmission of the next packet, is then calculated based
on the current value of the TimeToSend, the RecommendedBW
and the PacketSize (step 160). Control then returns to
step 142 where it is checked whether it is time to send the
next packet.

If the value of the time to send has not been reached, it is
then checked whether an acknowledgment has been received
(step 144). If an acknowledgment has not been received,
control loops back to step 142 and the time to send is
checked again. If an acknowledgment has been received, the
information contained in the acknowledgment packet is
stored in the database (step 146). The information stored in
the database includes an acknowledgment packet ID
(AckPacketID), the time to receive (TimeToReceive) and
the time to acknowledgment (TimeToAck). The value of the
acknowledgment packet ID is the value of the ID or
sequence number of the acknowledgment packet itself. The
time to receive is the time stamp generated by the client
which represents the time of arrival of the packet transmitted
by the sender that the acknowledgment packet corresponds
to. The time to acknowledgment is a time stamp generated
by the sender representing the time the acknowledgment
packet was received by the video server.

In the next step, various entities are then calculated (step
148). The number of bytes sent (BytesSent) by the sender is
calculated using the latest response (LastResp) and the
previous response (PreviousResp). The data for the previous
response and the last response are generated from the
respective acknowledgment packets received by the sender.
Similarly, the number of bytes received by the client
(BytesRec) is calculated using the information contained in
the acknowledgment packet for the previous response and
the last response. The send rate (SendRate) is then computed
by dividing the number of bytes sent by the difference
between the time to send (TimeToSend) for the previous
response subtracted from the time to send for the last or
current response, as shown below.

$$\mathrm{SendRate} = \frac{\mathrm{BytesSent}}{\mathrm{TimeToSend}(\mathrm{LastResp}) - \mathrm{TimeToSend}(\mathrm{PreviousResp})}$$

The receive rate (RecRate) is calculated by dividing
the number of bytes received by the time to receive
(TimeToRec) for the previous response subtracted from the
time to receive for the last response, as shown below.

$$\mathrm{RecRate} = \frac{\mathrm{BytesRec}}{\mathrm{TimeToRec}(\mathrm{LastResp}) - \mathrm{TimeToRec}(\mathrm{PreviousResp})}$$

The send rate is then compared to the receive rate (step
154). If the sending rate is less than or equal to the receiving
rate this means the network connection is being underutilized
and a portion of the bandwidth remains unused. In this
case, the recommended bandwidth is increased by a particular
amount, for example 10% (step 158). If the sending rate
is greater than the receiving rate this means too much data
is being pumped into the network pipe. The sending rate
needs to be reduced. Thus, the recommended bandwidth is set
equal to the receive rate (step 156).

Whether the bandwidth is increased or decreased, the next
step is to determine the new time to send based on the
current time to send, the recommended bandwidth and the
packet size (step 160). Control then returns to step 142
where it is checked whether the time has arrived to send
another packet.

Using this method, the sender constantly tries to utilize
the available bandwidth as efficiently as possible by keeping
the network pipe full. In other words, the sender attempts
to maintain the number of bytes online to correspond with
the bandwidth of the network connection. If the sender
sees the bandwidth of the network connection being
underutilized it increases the number of bytes online accordingly.
Conversely, if the sender determines that the bandwidth
of the network connection is being exceeded it appropriately
lowers the sending rate.

As discussed previously, the packet generator 102 (FIG.
9) receives frames of a particular level from the
frame selector 100. The function of the packet generator is
to encapsulate the video frame data into packets and transmit
them to the packet transmitter 104. In addition, the packet
generator determines which of the frames it receives to
encapsulate into packets and send to the packet transmitter.
The packet generator determines which frames to encapsulate
based on the recommended bandwidth determined by
the rate control unit 106.

A high level flow diagram illustrating the method of
selecting frames to be transmitted as performed by the
sender portion of the present invention is shown in FIG. 13.
Initially, the value Q_OPT is set to a particular value which
represents an optimum size of the queue within the video
client. The size of the queue is measured in time, i.e.
seconds, and represents the amount of video information
currently queued in the video client ready to be displayed.
The packet generator uses the current level of this queue to
determine which of the Key, P and B frames to send to the
packet transmitter.

First, it is checked to see whether the current level of the
client queue is less than half of Q_OPT which represents an
optimum size for the client queue (step 190). If the size of
the client queue is less than half of this value then only a Key
frame is sent (step 194). In this case only Key frames are
sent due to time considerations. The level of the client queue
is considered to be too short to send Key, P and B frames.
If the level of the client queue is found to be between half
the optimum queue Q_OPT and a full optimum queue (step
192), both Key and P frames are sent (step 196). In this case,
the client queue is considered to contain sufficient video
information in terms of time to permit the transmission of
Key and P frames. Lastly, if the size of the client queue is
equal to or above the level of the optimum queue, Key, P and
B frames are sent to the client (step 198). In this case, the
client queue is considered to contain enough seconds of
video to permit enough time to send Key, P and B frames.
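
Two of the sender-side decisions described above lend themselves to a short Python sketch: the fixed phase adjustment of the recommended bandwidth (steps 154 to 160) and the queue-based choice of frame types (steps 190 to 198). The function names are hypothetical. The formula used for the next time to send is an assumption, since the text states only that it depends on the current time to send, the recommended bandwidth and the packet size. Rates are assumed to be in bytes per second and times in seconds.

```python
def adjust_recommended_bw(send_rate, rec_rate, recommended_bw, increase=0.10):
    """Steps 154 to 158: compare the send rate with the receive rate."""
    if send_rate <= rec_rate:
        # Connection underutilized: raise the recommended bandwidth.
        return recommended_bw * (1.0 + increase)
    # Too much data in the pipe: fall back to the receive rate.
    return rec_rate


def next_time_to_send(time_to_send, packet_size, recommended_bw):
    """Step 160 (assumed form): space packets to match the bandwidth."""
    return time_to_send + packet_size / recommended_bw


def frame_types_to_send(queue_seconds, q_opt):
    """Steps 190 to 198: choose frame types from the client queue level."""
    if queue_seconds < q_opt / 2:
        return {"K"}            # queue too short: Key frames only
    if queue_seconds < q_opt:
        return {"K", "P"}       # between half and full optimum queue
    return {"K", "P", "B"}      # at or above the optimum queue


print(adjust_recommended_bw(150.0, 120.0, 150.0))
print(next_time_to_send(2.0, 1000, 4000))
print(sorted(frame_types_to_send(2.0, q_opt=4.0)))
```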

As described previously, the packet generator sends pack- 
ets for transmission to the packet transmitter. For each 
packet received by the packet transmitter an appropriate 
method of communications is selected. Key frames, for
example, should be sent using a very reliable communication
method since they are used as the basis for both P and
B frames. On the other hand, B frames may be sent using an
unreliable communication method since they are less important.

A high level flow diagram illustrating the method of
sending a packet performed by the packet transmitter portion
of the present invention is shown in FIG. 14. The first step is
to apply a time stamp to the packet received from the packet
generator (step 170). If the packet contains Key frame data
(step 172) then the packet is sent using a best effort
communication protocol. The best effort protocol can be an
implementation of a reliable UDP which includes the video
server retransmitting the Key frame as long as there is
enough time for the client to receive and display it on time.
If the packet contains P frame information (step 176) then
the packet is sent via a semi reliable communication protocol
(step 178). In this case, the server makes a decision based
on the available bandwidth whether to resend the P frame
information packet to the video client. Lastly, if the packet
contains B frame information it is sent via a non reliable
communication protocol such as UDP (step 180). In this
case, the video server does not have an option to retransmit
the packet if not received by the video client.
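
The per-frame-type delivery rules can be sketched in Python as follows. The function names, the string labels and the parameters `seconds_until_display`, `resend_seconds` and `bandwidth_available` are hypothetical. The sketch assumes the server can estimate how long a retransmission would take and how much time remains before the frame must be displayed.

```python
def delivery_policy(frame_type):
    """Map a frame type to the reliability class named in the text."""
    return {
        "K": "reliable",       # e.g. TCP or reliable UDP
        "P": "semi-reliable",  # reliable UDP, resend is optional
        "B": "unreliable",     # plain UDP, never resent
    }[frame_type]


def should_resend(frame_type, seconds_until_display, resend_seconds,
                  bandwidth_available=True):
    """Decide whether a lost packet is worth retransmitting."""
    in_time = resend_seconds < seconds_until_display
    if frame_type == "K":
        # Key frames are resent as long as they can still arrive on time.
        return in_time
    if frame_type == "P":
        # P frames are resent only if time and bandwidth both allow it.
        return in_time and bandwidth_available
    return False  # B frames are never retransmitted


print(delivery_policy("P"))
print(should_resend("K", seconds_until_display=2.0, resend_seconds=0.5))
print(should_resend("B", seconds_until_display=2.0, resend_seconds=0.5))
```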

Video Client Process 

The video client portion of the video transport system of
the present invention will now be described in more detail.
The video client is a graphical user interface (GUI) based
process or application that functions to decode a video
stream transmitted by the server. In general, the client
functions as an off-line video player, capable of playing back
local file streams, as well as an online video player utilizing
a direct connection to the server. Thus, the video client
supports both store and forward as well as real time
implementations of video over a network. The client can
preferably supply a VCR like GUI display, i.e., play, stop, fast
forward, pause, etc. buttons.

During a real time transmission of video data, the client
reports back status and bandwidth related information to the
video server via a reverse channel. Based on the number of
transmission errors as well as the number of data packets
lost, as communicated via the status and bandwidth
information sent back to the server, the server makes an online
determination regarding the quantity of data to send to the
client.
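
As a rough illustration of how a server might turn the client's reverse-channel report into a decision about how much data to send, consider the sketch below. It is not the scan phase or fixed phase measurement method of the invention. It substitutes a simple back-off/probe rule, and the report fields, the 5% loss threshold, the 0.75 back-off factor and the 16 Kbps probe step are all assumed values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class ClientReport:
    packets_sent: int     # packets the server sent in the reporting window
    packets_lost: int     # packets the client reports as never received
    packets_errored: int  # packets the client received with transmission errors


def loss_rate(report):
    """Fraction of packets in the window that were lost or damaged."""
    if report.packets_sent == 0:
        return 0.0
    bad = report.packets_lost + report.packets_errored
    return min(1.0, bad / report.packets_sent)


def next_send_rate(current_bps, report, loss_threshold=0.05,
                   backoff=0.75, probe_bps=16_000):
    """Return the send rate (bits per second) for the next window.

    Hypothetical rule: back off multiplicatively when the reported loss
    exceeds the threshold, otherwise probe upward by a fixed step.
    """
    if loss_rate(report) > loss_threshold:
        return current_bps * backoff
    return current_bps + probe_bps
```

For example, a report of 8 lost and 2 damaged packets out of 100 gives a loss rate of 0.1, so a 512 Kbps stream would be cut back to 384 Kbps under these assumed parameters.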

As described previously, this online decision forms the
core of the adaptive video transport system. The video server
makes a determination as to the bandwidth of the connection and
the quality of the connection, i.e., rate of packet loss, based
on the amount of information received by the video client.
Knowledge of the amount of data that each client receives is
essential to the server in order to determine the amount and
type of data to transmit to each particular video client.

Multi-Platform Video Server

In an alternative embodiment, the video server 18 of FIG. 
1 can be constructed using multiple platforms rather than a 
single platform. In this embodiment, the video server func- 
tionality is spread over multiple computer platforms. Each 
individual platform within the multi platform video server 
functions to transmit a single video compression level. A
high level block diagram illustrating an alternative
embodiment of the adaptive video transport system of the present
invention including the video compression/file generator,
multi-platform video server and video client is shown in
FIG. 15.

The adaptive video transport system, generally referenced
200, comprises a video compression/file generator 212, a
plurality of video servers #1 through #N 216 and one or
more video clients 220. Only one video client is shown in
FIG. 15 for the sake of clarity. The video compression/file
generator 212 in combination with the video client 220 comprise a
video codec or coder/decoder that functions to compress,
code, decode and decompress video/audio streams that are
transmitted over the network 218 into a plurality of
compressed video/audio files. Each compressed video/audio file
is compressed using a different compression level. Each
individual video server platform is responsible for
transmitting one of the video compression levels.

The video compression/file generator 212 functions simi- 
larly to that of the video compression/file generator 14 of 
FIG. 1 with the exception that the video compression/file 
generator of FIG. 15 generates a separate compressed video/ 
audio file for each compression level. For N compression 
levels, the video compression/file generator 212 functions to 
generate a compressed video/audio file 214 for levels 1 
through N. Considering the system described previously, 
compressed video/audio files 214 are generated for Levels 1 
through Level 5. The compressed video/audio files may be 
in any suitable format such as AVI format. The generation of 
the compressed video/audio files 214 can be performed 
either on-line or off-line. Typically the video/audio file is 
generated off-line. Note that any suitable method of video 
compression can be utilized to process the raw video data 
210 such as described in connection with the MPEG-1, 
MPEG-2 or MPEG-4 standards. 

In order to serve N bandwidth levels, where each band- 
width level represents a different quality/resolution band, N 
video servers and N compressed video/audio files are 
required. One compressed video/audio file and video server 
are associated with each bandwidth level, i.e., compression 
level. Thus, the complete video server system comprises N 
separate video server platforms each handling one compres- 
sion level. An additional platform 222 functions as a rate 
controller (bandwidth controller) which performs the scan 
phase and fixed phase bandwidth measurement methods, 
frame selection method and packet transmission method as 
described previously in connection with FIGS. 11 through 
14. The rate controller 222 functions as a bandwidth con- 
troller executing the bandwidth measurement methods 
described earlier and is operative to select which of the video 
servers #1 through #N to transmit to the video client 220. For
each client, data from only one video server is sent at any 
one time. The same video server is used to send data for an 
entire GOP. However, different video servers can be utilized 
to send video/audio data for other GOPs since the compres- 
sion level for a GOP is independent of the compression 
levels used for other GOPs. 
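
The per-GOP selection performed by the rate controller can be sketched as follows. This is an illustrative assumption rather than the controller's actual logic: the `VideoServer` type, the server names, the mapping of one stream rate to each server and the "highest rate that fits the estimate" rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoServer:
    name: str
    stream_kbps: float  # assumed bit rate of the single level this server sends


def pick_server(servers, estimated_kbps):
    """Pick the highest-rate server whose stream fits the bandwidth estimate.

    Falls back to the lowest-rate server when nothing fits.
    """
    fitting = [s for s in servers if s.stream_kbps <= estimated_kbps]
    if fitting:
        return max(fitting, key=lambda s: s.stream_kbps)
    return min(servers, key=lambda s: s.stream_kbps)


def plan_session(servers, estimates_per_gop):
    """Make one selection per GOP; the chosen server sends that entire GOP."""
    return [pick_server(servers, est).name for est in estimates_per_gop]


if __name__ == "__main__":
    rates = [28.8, 56, 128, 256, 512]
    servers = [VideoServer(f"server{i + 1}", r) for i, r in enumerate(rates)]
    print(plan_session(servers, [300, 100, 20]))
```

Because the choice is made once per GOP, the client receives data from only one server at a time, while successive GOPs may come from different servers as the bandwidth estimate changes.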

Each of the N video servers 216 can comprise the video
server 18 (FIG. 2) described previously or may comprise a
standard off-the-shelf video server such as the MPEG-2
based Media Server from Oracle Inc. or the NetShow Server
from Microsoft Corporation, Redmond, Wash. The standard
video server must be suitably modified to provide a
communication capability with the rate controller 222 before it
will operate in the present invention. The modifications
typically include providing a communication interface
between the standard video server and the rate controller.

The video client 220 functions similarly and is
constructed in similar fashion to that of the video client shown
in FIG. 3. The video client functions to decode and
decompress the video/audio data stream and sends the decoded/
decompressed video/audio stream to a display connected
thereto. In addition, the video client 220 is adapted to issue
the video file requests to the rate controller 222 rather than
to any of the video servers 1 through N. Throughout the
video transmission session, the video client 220 functions to
return acknowledgments and statistics to the rate controller
222. The rate controller uses the acknowledgments and
statistics returned by the video client 220 in order to
calculate the optimum compression (resolution) level to use.

An advantage of the adaptive video transport system of
FIG. 15 is that performance is enhanced. The performance
enhancement is achieved in part by the use of standard video
servers which are optimized for performance. Assuming
each of the N video servers can generate 250 concurrent 512
Kbps video/audio streams, the complete video server system
is capable of generating up to 250×N concurrent 512 Kbps
non-scaleable streams and an even higher number of
concurrent lower speed 28.8, 56, 128 or 256 Kbps streams. Note
that whenever the initial bandwidth is known, for example
within an Intranet, the video client will play back video
directly from the most suitable server. When the initial
bandwidth is unknown beforehand, the rate controller 222
functions to determine the optimum bandwidth for the
particular network connection. Thus, the alternative
embodiment shown in FIG. 15 can be utilized to implement the
multi-compression layer and adaptive bandwidth
measurement scheme of the present invention by piggybacking on
existing standard video server technology and enhancing it
to offer scaleable video transmission.

While the invention has been described with respect to a 
limited number of embodiments, it will be appreciated that 
many variations, modifications and other applications of the 
invention may be made. 

What is claimed is: 

1. A method of transporting video over a network channel, 
comprising the steps of: 

compressing a raw video source into a plurality of frames,
each frame comprising a plurality of levels, each level
corresponding to a particular degree of compression;

estimating the bandwidth of the network channel;

selecting one of said plurality of levels of each frame to
transmit over the network channel in accordance with
said bandwidth estimate whereby the level selected
optimizes the use of the bandwidth of the network
channel; and

sending said selected level of each frame over the network
channel.

2. The method according to claim 1, wherein said step of
compressing comprises the step of compressing the raw
video source into a plurality of different types of frames,
each frame type containing different amount of video
content information, said plurality of different types of frames
grouped so as to form a video stream consisting of a plurality
of group of pictures (GOP) sequences.

3. The method according to claim 1, wherein said step of
compressing comprises the step of compressing the raw
video source into Key, P and B type frames, said Key, P and
B frames generated so as to form a video stream consisting
of a plurality of group of pictures (GOP) sequences.

4. A method of transporting video from a video server to 
a video client over a network channel, comprising the steps 
of: 

compressing data from a raw video source so as to 
generate a plurality of frames, each frame being of a 
particular frame type, each frame type containing a 
particular amount of video content information, each 
frame comprising a plurality of levels, each level 
corresponding to a particular degree of compression; 

estimating the bandwidth of the network channel; 

determining the amount of video information waiting to 
be displayed at the video client; 

selecting one of said plurality of levels of each frame to 
send over the network channel in accordance with said 
bandwidth estimate whereby the level selected opti- 
mizes the use of the bandwidth of the network channel; 

choosing which frames having a particular frame type to 
send over the network channel in accordance with the 
amount of video information waiting to be displayed at 
the video client; and

sending the chosen frames having a particular frame type 
and of said selected level over the network channel. 

5. A video server for transporting video from a video 
source over a network channel to a video client, said video 
source consisting of a plurality of frames of video data, each 
frame of video data consisting of multiple compression 
levels and being of a particular type, said video server 
comprising: 

receiver means for inputting frames of video data from the 
video source; 

sending means coupled to said receiver means, said 
sending means for determining which compression 
level within said frame and which frames having a 
particular type to transmit in accordance with the 
estimated available bandwidth of the network channel, 
said sending means for encapsulating said frames of 
video data into a plurality of packets for transmission 
over said network channel; and 

a controller for managing the operation of said receiver 
means and said sending means whereby the rate of 
transmission of said sending means is maintained so as 
to match the available bandwidth of the network chan- 
nel. 

6. The video server according to claim 5, wherein said 
sending means comprises: 

a rate control unit for measuring the available bandwidth 
of the network channel; 

a frame selector for inputting video frame data output by 
said receiver means, said frame selector outputting 
frames of a particular compression level in accordance 
with the bandwidth measured by said rate control unit; 

a packet generator for inputting video frame data output 
by said frame selector, said packet generator for encap- 
sulating said video frame data into a plurality of 
packets for transmission, said packet generator deter- 
mining which frames having a particular type are to be 
transmitted; 

a packet transmitter for placing onto the network channel
the plurality of packets output by said packet generator;
and

a receiver for receiving acknowledgments sent by the
video client over the network channel in response to
packets received thereby.

7. A method of transporting video from a video server to
a video client over a network channel, comprising the steps 
of: 

compressing data from a raw video source so as to 
generate a plurality of frames, each frame being of a 
particular frame type, each frame type containing a 
particular amount of video content information, each 
frame comprising a plurality of levels, each level 
corresponding to a particular degree of compression; 

estimating the bandwidth of the network channel; 

determining the amount of video information waiting to 
be displayed at the video client; 

selecting one of said plurality of levels of each frame to 
send over the network channel in accordance with said 
bandwidth estimate whereby the level selected optimizes
the use of the bandwidth of the network channel;

choosing which frames having a particular frame type to 
send over the network channel in accordance with the 
amount of video information waiting to be displayed at 
the video client; 

sending the chosen frames of a type containing a higher 
amount of video data content and of a selected level 
over the network channel utilizing a reliable commu- 
nication protocol; and 

sending the chosen frames of a type containing a lower
amount of video data content and of a selected level over
the network channel utilizing an unreliable
communication protocol.

8. A video server system for transporting video from a 
plurality of video sources over a network channel to a video 
client, each video source consisting of a plurality of frames 
of video data, each frame of video data consisting of a single 
compression level and being of a particular type, said video 
server system comprising: 

a plurality of video servers, each video server associated 
with a single video source at a particular compression 
level, each video server comprising: 
receiver means for inputting frames of video data from
the video source associated with that particular video
server;

sending means coupled to said receiver means, said
sending means for determining which frames having
a particular type to transmit in accordance with the
available bandwidth of the network channel, said
sending means for encapsulating said frames of
video data into a plurality of packets for transmission
over said network channel;

a controller for managing the operation of said receiver
means and said sending means; and

a rate controller for determining which video server to 
utilize for transmission of video data based on the 
available bandwidth of the network channel. 

9. The video server system according to claim 8, wherein
said sending means comprises:

means for interfacing said video server to said rate 
controller; 

a bandwidth measurement unit for measuring the available
bandwidth of the network channel;

a packet generator for inputting video frame data output 
by said receiver means, said packet generator for 
encapsulating said video frame data into a plurality of 
packets for transmission, said packet generator determining
which frames having a particular type are to be
transmitted; 

a packet transmitter for placing onto the network channel 
the plurality of packets output by said packet generator;
and

a receiver for receiving acknowledgments sent by the 
video client over the network channel in response to 
packets received thereby. 

10. A method of transporting a video stream over a
network channel, comprising the steps of:

a) compressing a raw video source into a plurality of 
quality levels, each quality level corresponding to a 
particular degree of compression and having an associated
group of pictures sequence;

b) estimating the bandwidth of the network channel;

c) selecting one of the plurality of quality levels in 
accordance with the bandwidth estimate, wherein the 
selected quality level optimizes the use of the bandwidth
of the network channel during a first time interval; and

d) sending the group of pictures sequence associated with 
the selected quality level over the network channel. 

11. The method of claim 10, wherein step a) comprises
compressing the raw video source into a plurality of different
types of frames, each frame type having video data
corresponding to one of the plurality of quality levels.

*****